Local Word Vectors Guide Keyphrase Extraction
Authors
Abstract
Word vector representation techniques, built on word-word co-occurrence statistics, often provide representations that capture the differences in meaning between words. This property is a powerful tool that can be exploited in a wide range of natural language processing tasks. In this work, we propose a simple and efficient unsupervised approach for keyphrase extraction, called the Reference Vector Algorithm (RVA), which uses a local word vector representation obtained by applying the GloVe method in the context of one scientific publication at a time. The mean word vector (reference vector) of the article's abstract then guides the selection of candidate keywords, using cosine similarity. Experimental results from a thorough evaluation show that our method outperforms state-of-the-art methods by providing high-quality keyphrases in most cases, and in this way suggests an additional way to exploit GloVe word vectors.
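The reference-vector step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the local word vectors have already been trained (for example with GloVe on the single document), and the names `mean_vector`, `cosine`, `rank_candidates`, and `toy_vectors` are hypothetical, as are the toy vector values. Candidate selection and the assembly of multi-word keyphrases are not covered by the abstract and are left out.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(local_vectors, abstract_tokens, candidates):
    """Rank candidate words by cosine similarity to the abstract's mean vector.

    local_vectors: dict mapping word -> vector (assumed pre-trained per document).
    Words without a vector are skipped.
    """
    abstract_vecs = [local_vectors[t] for t in abstract_tokens if t in local_vectors]
    if not abstract_vecs:
        return []
    reference = mean_vector(abstract_vecs)  # the "reference vector"
    scored = [
        (word, cosine(local_vectors[word], reference))
        for word in candidates
        if word in local_vectors
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 3-dimensional vectors, for illustration only.
toy_vectors = {
    "keyphrase": [0.9, 0.1, 0.0],
    "extraction": [0.8, 0.2, 0.1],
    "vector": [0.7, 0.3, 0.0],
    "weather": [0.0, 0.1, 0.9],
}

ranking = rank_candidates(
    toy_vectors,
    abstract_tokens=["keyphrase", "extraction", "vector"],
    candidates=["weather", "keyphrase", "vector"],
)
for word, score in ranking:
    print(f"{word}: {score:.3f}")
```

In this toy setup, words whose vectors point in roughly the same direction as the abstract's mean vector rank above the off-topic word; with real data the vectors would come from a GloVe model trained on the document itself.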
Similar resources
Local Word Vectors Guiding Keyphrase Extraction
Automated keyphrase extraction is a fundamental textual information processing task concerned with the selection of representative phrases from a document that summarize its content. This work presents a novel unsupervised method for keyphrase extraction, whose main innovation is the use of local word embeddings (in particular GloVe vectors), i.e. embeddings trained from the single document und...
Corpus-independent Generic Keyphrase Extraction Using Word Embedding Vectors
Keyphrase extraction from a given document is a difficult task that requires not only local statistical information but also extensive background knowledge. In this paper, we propose a graph-based ranking approach that uses information supplied by word embedding vectors as the background knowledge. We first introduce a weighting scheme that computes informativeness and phraseness scores of word...
273. Task 5. Keyphrase Extraction Based on Core Word Identification and Word Expansion
This paper provides a description of the Hong Kong Polytechnic University (PolyU) System that participated in the task #5 of SemEval-2, i.e., the Automatic Keyphrase Extraction from Scientific Articles task. We followed a novel framework to develop our keyphrase extraction system, motivated by differentiating the roles of the words in a keyphrase. We first identified the core words which are de...
Keyphrase Extraction and Grouping Based on Association Rules
Keyphrases are important in capturing the content of a document and thus useful for many natural language processing tasks such as Information Retrieval, Document Classification, and Text Summarization. Keyphrase extraction aims to identify multi-word sequences from a collection of documents that more or less correspond to keyphrases. In this paper, we propose a new method for keyphrase extract...
Reducing Over-generation Errors for Automatic Keyphrase Extraction using Integer Linear Programming
We introduce a global inference model for keyphrase extraction that reduces overgeneration errors by weighting sets of keyphrase candidates according to their component words. Our model can be applied on top of any supervised or unsupervised word weighting function. Experimental results show a substantial improvement over commonly used word-based ranking approaches.
Journal title:
- CoRR
Volume abs/1710.07503, Issue -
Pages -
Publication date 2017